1 Purpose

This document summarizes preliminary results of analyses that seek to quantifying the extent to which salmon productivity is correlated with (a) ocean conditions during early marine life and (b) potential salmon competitors later in marine life, and the degree to which these relationships vary over time and space. More details on motivation for project can be found here and all code to reproduce the analyses in this document is here. All findings are preliminary and subject to change.

2 Methods

We are fitting four general classes of spawner-recruitment models to characterize relationships between ocean conditions and salmon productivity over time:

  1. Stationary models that estimate a single time invariant relationship between ocean conditions and salmon productivity (e.g., Connors et al. 2020)

  2. A-priori change point models that allow relationships to vary among periods that corresponded with observed Northeast Pacific regime shifts in 1976/77 and 1988/89, as well as the onset of the recent marine heat wave period in 2014 (e.g., Malick 2020)

  3. Random walk models that allow relationships to evolve through time (e.g., Malick 2020)

  4. Hidden Markov models that allow relationships to vary according to the state (regime) the system is in, where the sequence of states and state specific relationships are estimated using a Hidden Markov Model framework.

Details on model formulation are provided below. These models are fit in a Bayesian estimation framework using Stan and compared using approximate leave-one-out cross validation. Inference will be primarily based on the magnitude, direction, and uncertainty of standardized covariate effects and the degree to which they vary over time and across ocean regions, species, and life histories.

2.1 Data

2.1.1 Spawner-recruitment

Figure 1. Productivity (R/S) time series of all Sockeye stocks included in analyses.

We have been working to update and add to a large collection of time series of spawner (escapement) and total recruitment (catch plus escapement) for all five species of Pacific salmon. Details on the specific timeseries are avaialble here. The spatial and temporal extent of those populations and species that we have fit models to so far are detailed below.

FIGURE: map of sockeye salmon ocean-entry locations.

FIGURE: sparkline/“lie detector” plot of ln(R/S) timeseries

2.1.2 Covariates

Our analysis is focused on two covariates:

  1. Sea surface temperatures near juvenile salmon ocean-entry points to index regional-scale environmental variability during the first few months in the ocean, which are hypothesized to represent a critical period in the Pacific salmon life cycle that can strongly influence year class strength.

  2. The abundance of sockeye, chum, and pink salmon across the North Pacific in the second and/or third years of marine life as an index of potential direct or indirect competition for food. This approach is consistent with research that has suggested some salmon species primarily exhibit responses to competition with other salmon during their second and/or third growing seasons at sea (Connors et al. 2020; Ruggerone and Connors 2015).

Figure 2. Sockeye ocean entry locations and associated climate and competitor time series. Numbered stocks are named in Figure 1. (a) Unique ocean entry points of Sockeye stocks in four Ocean Regions defined by ocean entry points. (b) Time series of Pink salmon abundance index in millions (top row) and SST anomaly (bottom row). Stock-specific timeseries are shown by light lines and regional averages by thick lines.

2.2 Stationary models

In the simplest stationary class of models salmon productivity was modelled hierarchically as a function of spawner abundance, early marine ocean conditions, and potential competitor abundance later in marine life:

\[\begin{equation} ln(R_{i,t}/S_{i,t}) = \alpha_{i} - \beta S_{i,t} + \gamma_{k,i,r} X_{k,i,t} + \varepsilon_{i,t} \tag{2.1} \end{equation}\]

where \(R\) is the total recruitment from population \(i\) and spawners \(S\) in brood year \(t\), \(\alpha\) is productivity (intrinsic rate of growth), \(\beta\) is the magnitude of within brood-year density-dependent effects on survival, \(\gamma\) is the effect of covariate \(k\), \(X\) is the specific covariate standardized to mean 0 and standard deviation of 1, and \(\epsilon\) is inter-annual variation in survival from egg to adulthood which was assumed to follow a first-order autocorrelated process:

\[\begin{equation} \varepsilon_{i,t}=\phi \varepsilon_{i,t-1} + \sqrt{1-\phi^2 }\delta_{i,t} \\ \delta_{i,t}\sim N(0,\sigma_{i}^2) \tag{2.2} \end{equation}\]

where \(\phi\) is the autocorrelation coefficient, and \(\delta\) represents uncorrelated, white noise, interannual variation in survival.

The population-specific parameters \(\gamma_{k,i,r}\) were modelled hierarchically by assuming they arise from common hyper-distributions:

\[\begin{equation} \gamma_{k,i,r}\sim N(\mu_{\gamma,k,r},\sigma_{\gamma,k,r}^2) \tag{2.3} \end{equation}\]

where \(r\) is one of four broad ocean regions within which salmon populations have previously been shown to exhibit similar responses to ocean climate and ecosystem variation and \(\mu\) is the region level average effect.

For Fraser River sockeye stocks (No. 1-19 in Fig. 1), spawner abundances were in units of effective female spawners (i.e., female spawner abundance adjusted for unspawned eggs), whereas for all other stocks spawner abundances were total male and female spawners. Thus, \(\alpha_{i}\) parameters were split into two groups (one group that included Fraser River stocks and another group that included all non-Fraser River stocks) that were exchangeable within each group but not between the two groups.

2.3 Era models

The first class of non-stationary model we considered allowed the \(\gamma_{k,i,r,e}\) parameters to change at predefined points in time corresponding with well documented Northeast Pacific regime shifts in 1976/77 and 1989/99 (Hare and Mantua, 2000) resulting in three eras (\(e\)). As with the stationary models we assumed the era specific climate and ecosystem parameters were exchangeable among populations within the same ocean ecosystem and era thereby estimating both population specific and ocean ecosystem level effects by era.

2.4 Random walk models

The second class of non-stationary models we considered allowed the climate and ecosystem parameters to evolve over time according to a random walk:

\[\begin{equation} \gamma_{k,i,t}=\gamma_{k,i,t-1}+w_{k,t}\\ w_{k,t}\sim N(0,\sigma_{w_{k}}^2) \tag{2.4} \end{equation}\]

where \(w\) is process variation and \(\sigma_{w}^2\) determines the degree if temporal variability in the \(\gamma\) series. For these models the covariate effects were modelled independently instead of being exchangeable among populations within regions (i.e., hierarchically).

2.5 Hidden Markov models

The last class of non-statioanry models combines elements of both the Era and Random walk models by estimating discrete climate and ecosystem effects \(\gamma_{k,i,q}\) dictated by regime state using a hidden Markov model approach. The estimated time-series of hidden regime states \(z_{t}\) emerges from the evidence from observations \(y_{t}\) and is modelled as a Markov process - ie. the year-to-year regime states depend only on the state in the previous time-step (\(p(z_{t}|z_{t-1})\)) according to an estimated transition probability matrix. The hidden Markov model approach is conceptually similar to the Random walk, but where the coefficient effects are pooled into discrete ordered distributions (ie. low or high estimates in 2 states) with the subsequent probability of being in each state is evaluated at each time-step. In contrast to the Era model, the hidden regime states through time are inferred from the data rather than set a priori (ie. the regimes are ‘unsupervised’). The model is evaluated based on the joint likelihood of the observations given each regime \(p(y_{1:T}|z_{1:T})\) and the regime sequence \(p(z_{1:T})\).

2.6 Model fitting

We fit the model described above in a Bayesian estimation framework with Stan Stan Development Team 2020, which implements the No-U-Turn Hamiltonian Markov chain Monte Carlo algorithm Hoffman et al. 2014 for Bayesian statistical inference to generate the joint posterior probability distribution of all unknowns in the model. For each model run, we sampled from 4 chains with 4,000 iterations each and discarded the first 1,000 as warm-up. We assessed chain convergence visually via trace plots and posterior predictive checks and by ensuring that \(\hat{R}\) [potential scale reduction factor; Vehtari et al. 2021] was less than 1.01 and that the effective sample size was greater than 400 for each parameter in the model.

3 Results

3.1 Stationary models

Figure 3. Posterior probability distributions of the predicted effect of (a) SST, (b) competitors, and (c) the combined effect from all covariate terms, on sockeye salmon survival. Overall hyperdistribution of the covariate effects are in bold lines, with individual stock-specific distributions illustrated by the light lines. Covariate effects are standardized (i.e., per standard deviation unit increase in each covariate).

Figure 4. Stock-specific posterior mean (points) and 95% CI (horizontal bars) estimates of SST and Competitor covariate effects (left and right, respectively). Vertical lines and shaded areas are group mean effects and 95% CI.

3.2 A-priori models

Figure 5. Posterior probability distributions of the predicted effect of SST and competitors on Sockeye productivity over three pre-defined time periods (eras). Breakpoints between eras correspond with documented NP ocean regime shifts. Regional mean (hyperdistribution) covariate effects are in bold lines and individual stocks’ distributions are in light lines.

Figure 6. Stock-specific posterior mean (points) estimates of SST and Competitor covariate effects through three pre-defined eras: <1976 (lightest shades), 1977-1988 (middle shades), and >1989 (darkest shades). Vertical lines and shaded areas are group mean effects and 80% CI.

3.3 Random walk models

Figure 7. Time-varying posterior mean estimates of SST and Competitor covariate effects, which vary from year to year as a random walk. Individual stock estimates are in faint lines, while regional means and 90% CI are represented by bold lines and shaded areas. Regional summaries are post-hoc calculations, as random walk parameters were not grouped hierarchically.

3.4 Hidden Markov models

Figure 8. Time-varying mean estimates of SST and Competitor covariate effects from hidden Markov models. Time-varying effects are calculated from the estimated probability of each hidden state in a given year, and covariate coefficients for each state (Figure 9). Individual stock estimates are in faint lines, while regional means and 90% CI are represented by bold lines and shaded areas. Regional summaries are post-hoc calculations, as each stock was modelled seperately.

Figure 9. Posterior estimates of SST and Competitor coefficients for each hidden state (n=2) in stock-specific hidden markov models. Vertical ticks show posterior means, shaded areas are 80% CI, and horizontal bars are 95% CI. States are determined by SST, and a competitor effect is then determined for each state.

4 Next steps

  • Explore alternative temporal and spatial domains over which to average SST anomalies to derive the early ocean temperature covariate

  • Update sockeye, pink, and chum spawner-recruitment time-series based on data from ADF&G and DFO

  • Incorporate Chinook and coho spawner-recruitment time series into analysis